ON NONGAUSSIAN SIGNAL DETECTION AND CHANNEL CAPACITY


Charles R. Baker

Department  of  Statistics 
University  of  North  Carolina 
Chapel  Hill,  NC  27514 

INTRODUCTION 

This  paper  contains  a  discussion  of  some  recent  research  in  signal  detection  and 
communications.  No  proofs  are  included;  the  emphasis  is  on  motivation  and  results. 
Some  precise  definitions  and  results  are  contained  in  the  Appendix.  The  paper  is 
couched  in  terms  of  problems  in  underwater  acoustics;  as  will  be  seen,  the  models  and 
results  are  of  general  applicability. 

SIGNAL  DETECTION 

Two  problems  will  be  discussed.  The  first,  for  which  the  most  complete  results 
were  obtained,  was  that  of  detecting  nonGaussian  signals  in  Gaussian  noise.  The 
second,  motivated  by  examination  of  noise  properties  for  actual  sonar  data,  was  that  of 
detection  in  a  class  of  nonGaussian  processes,  which  can  be  regarded  as  mixtures  of 
Gaussian  processes. 

Both  of  these  problems  are  important  in  sonar  applications.  The  detection  of 
nonGaussian  signals  in  Gaussian  noise  can  be  regarded  as  the  canonical  problem  for 
active  detection  in  reverberation-limited  noise,  especially  volume  reverberation.  The 
noise  in  such  situations  can  frequently  be  regarded  as  arising  from  reflections  by  many 
small  scatterers,  which  can  be  reasonably  assumed  to  have  statistically-independent 
behavior.  The  central  limit  theorem  then  gives  a  Gaussian  process.  The  signal  process, 
however,  will  frequently  be  dominated  by  reflections  from  a  few  large  scatterers,  such  as 
the sonar dome. Each of these scatterers gives rise to a nonGaussian random process, and
these processes are summed at the receiver to give a nonGaussian process.

Other  applications  may  also  involve  detection  of  nonGaussian  signals  in  Gaussian 
noise.  For  example,  an  emerging  passive  sonar  detection  problem  is  that  of  detecting 
very  quiet  submarines,  emanating  primarily  broadband  signals.  These  signals  may  prove 
to  be  nonGaussian  and  one  will  frequently  be  faced  with  detecting  them  in  a  Gaussian 
noise  background. 

The  importance  of  the  problem  of  detecting  signals  in  nonGaussian  noise  of  a 
“spherically-invariant”  (Gaussian  mixture)  type  has  become  apparent  by  examining  the 
results  of  data  analysis  on  acoustic  recordings  obtained  from  both  under-ice  and 
shallow-water environments. These noise recordings have exhibited data whose univariate
distribution properties appear similar to those of Gaussian random variables (symmetric,
unimodal, smooth). However, when compared with zero-mean Gaussian random
variables having the same variance, the data often exhibits heavy tails and/or high kurtosis.
These features, at least in the univariate case, are well suited to a Gaussian
mixture model for the noise. If the multivariate data has a Gaussian mixture distribution,
then the modeling problem (in the context of modeling nonGaussian processes) is
greatly  simplified,  as  will  be  discussed  below. 
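
The heavy-tail behavior described above can be checked numerically. The following is a minimal sketch, assuming a univariate scale mixture X = A*G with G standard Gaussian and A an independent two-point random variable. The scale values and probabilities are hypothetical and are not taken from the sonar data. The sketch uses the identity kurtosis(X) = 3 E[A^4] / (E[A^2])^2, which follows from E[G^4] = 3.

```python
import random

def mixture_kurtosis(scales, probs):
    """Exact kurtosis of X = A*G, with G ~ N(0, 1) and A discrete, independent of G."""
    m2 = sum(p * a**2 for a, p in zip(scales, probs))  # E[A^2]
    m4 = sum(p * a**4 for a, p in zip(scales, probs))  # E[A^4]
    return 3.0 * m4 / m2**2  # E[X^4] / E[X^2]^2, since E[G^4] = 3

def sample_mixture(scales, probs, n, rng):
    """Draw n independent samples of X = A*G."""
    return [rng.choices(scales, weights=probs)[0] * rng.gauss(0.0, 1.0)
            for _ in range(n)]

def sample_kurtosis(xs):
    """Moment estimate of the kurtosis for zero-mean data."""
    m2 = sum(x * x for x in xs) / len(xs)
    m4 = sum(x**4 for x in xs) / len(xs)
    return m4 / m2**2

if __name__ == "__main__":
    scales, probs = [0.5, 2.0], [0.8, 0.2]  # hypothetical values with E[A^2] = 1
    print(mixture_kurtosis(scales, probs))   # about 9.75, versus 3 for a Gaussian
    data = sample_mixture(scales, probs, 50_000, random.Random(1))
    print(sample_kurtosis(data))
```

With unit variance in both cases, the mixture has kurtosis well above the Gaussian value of 3, which is the qualitative feature reported for the recordings.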

DETECTION  OF  NONGAUSSIAN  SIGNALS  IN  GAUSSIAN  NOISE 

Our  objective  here  was  to  give  a  complete  solution  of  the  detection  problem.  The 
actual  data  processes  appear  as  functions  of  continuous  time.  The  desirable  results  then 
include  the  following: 

(1) Characterization of signal-plus-noise processes for which the detection problem is
well defined. By this, we mean a mathematical model which does not promise perfect
(singular) detection. Such singular models are not considered to be realistic.

(2) For well-defined problems, determination of the likelihood ratio for the continuous-time
problem.

(3) Approximation of the continuous-time likelihood ratio by a discrete-time form,
preferably in recursive or near-recursive form.

(4) Specification of procedures for estimating parameters appearing in the approximation
to the likelihood ratio.

(5)  Performance  evaluation  of  the  approximation  to  the  likelihood  ratio. 

Of course, there are other desirable results, such as development of robust approximations
to the likelihood ratio, and approximations which do not require a full description
of data parameters. However, the results listed in (1)-(5) above are already very
ambitious, and obtaining them would be a significant step in any complete solution to
the problem.

Considerable work has been done on detection of Gaussian signals in Gaussian
noise; see, for example, some of the references given in [5]. However, previous work on
detection of nonGaussian signals in Gaussian noise has been subject to one or more of
the following limitations:

(a) The noise is assumed to be the Wiener process; see, e.g., [14]. The paths of the
Wiener process are far too irregular to reasonably model sonar noise. Other Wiener
process properties, such as independent increments and the Markov property, are not
typically satisfied. Moreover, determination of likelihood ratio parameters is left as an
open problem.

(b) Detection is based on second-moment criteria, such as the deflection criterion [1],
[2].

(c) Signal and noise are taken to be independent [3], and expressions for the likelihood
ratios are not obtained.

We have obtained a reasonably complete solution to problems (1)-(3) above. The
solutions to (4) and (5) are presently under computational investigation. Here we give a
rough summary of the results. More precise statements require substantial mathematical
machinery; reference is made to [8] for the complete and final statements; partial
results are contained in the Appendix.

The data is assumed to be observed over a finite interval, which we take as [0,1].
The  noise  is  Gaussian,  mean-square  continuous,  zero-mean,  and  is  assumed  to  vanish 
(almost  surely)  at  t=0. 

To solve problem (1) mentioned above, one would wish to consider a general
nonGaussian process $(Y_t)$, and determine necessary and sufficient conditions for the
detection problem to be well-defined (non-singular). Such conditions are given in
Theorem 1 and Theorem 2 of the Appendix. Roughly, they require that the process $(Y_t)$
have a signal-plus-noise representation $Y_t = S_t + N_t$, where the sample paths of $(S_t)$
belong almost surely to the reproducing kernel Hilbert space of $(N_t)$. This condition
means that the covariance function (resp., sample paths) of the signal process must be
much smoother than the covariance (resp., sample paths) of the noise. The process $(S_t)$
must also satisfy certain measurability conditions with respect to $(Y_t)$ and $(N_t)$. See the
Appendix for the precise statements. We remark that the necessary conditions and the
sufficient conditions are not identical, although they are very close.

Problem (2) mentioned above is that of determining the continuous-time likelihood
ratio. A general solution has been obtained, and is given in [8]. Actually, two solutions
are given there. One views the observations as being simply real-valued functions; the
other treats them as being elements of $L_2[0,1]$. The latter is summarized in the Appendix.
Here we shall give the finite-sample discrete-time approximation to the likelihood
ratio on $L_2[0,1]$.

First, the noise is a Gaussian vector having covariance matrix $R_N$. We can
represent $R_N$ by $R_N = \tau F F^*$, where $F$ is a lower-triangular matrix and $\tau$ is the sampling
interval. We can thus consider the noise process to be a sampled version of

$$N_t = \int_0^t F(t,s)\,dW_s,$$

where $(W_t)$ is the standard Wiener process. According to the results of [8], the
signal-plus-noise process will be of the form

$$S_t + N_t = \int_0^t F(t,s)\,dZ_s,$$

where $(Z_t)$ is here taken to be a diffusion with memoryless drift function $\sigma$:

$$Z_t = \int_0^t \sigma_s(Z_s)\,ds + W_t.$$
The resulting discrete-time approximation to the log-likelihood ratio is then

$$\Lambda^{n+1}(X^{n+1}) = \Lambda^{n}(X^{n}) + \sigma_n[(L F_n^{-1} X^n)_n]\,(F_{n+1}^{-1} X^{n+1})_{n+1} - \frac{\tau}{2}\,\sigma_n^2[(L F_n^{-1} X^n)_n], \quad n \ge 1;$$

$$\Lambda^{1}(X^{1}) = 0,$$

where $X^n$ denotes the observation vector obtained from the first $n$ samples;
$\tau F_n F_n^*$ is the noise covariance matrix for the first $n$ sample times;
$L$ is the summation matrix: $(L X^n)_i = \sum_{j=1}^{i} X^n_j$.

This formulation of the log-likelihood ratio is partially recursive. Note that

$$(L F_n^{-1} X^n)_n = (L F_{n-1}^{-1} X^{n-1})_{n-1} + (F_n^{-1} X^n)_n,$$

and that the operation $(F_n^{-1} X^n)_n$ is just a cross-correlation of the data vector with the
$n$th row of $F_n^{-1}$.
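
The partially recursive structure can be sketched in code. The sketch below assumes the update Lambda^{n+1} = Lambda^n + sigma_n(Z_n) dZ_{n+1} - (tau/2) sigma_n(Z_n)^2, that is, the Euler discretization of the log-likelihood ratio for a diffusion with drift sigma, with dZ = F^{-1} X and Z_n the running sum of the increments. The function names and the drift signature `sigma(n, z)` are hypothetical, chosen for illustration only.

```python
def forward_substitute(F, x):
    """Solve F dz = x for a lower-triangular F given as a list of rows."""
    dz = []
    for i, xi in enumerate(x):
        acc = xi - sum(F[i][j] * dz[j] for j in range(i))
        dz.append(acc / F[i][i])
    return dz

def log_likelihood_sequence(F, x, sigma, tau):
    """Return [Lambda^1, ..., Lambda^n] for the data vector x.

    F     : lower-triangular factor with tau * F F^T the noise covariance
    sigma : assumed drift, called as sigma(n, z) with n the 1-based time index
    tau   : sampling interval
    """
    dz = forward_substitute(F, x)   # increments (F^{-1} X)_k
    lam = [0.0]                     # Lambda^1 = 0
    z = dz[0]                       # Z_1 = (L F_1^{-1} X^1)_1
    for n in range(1, len(x)):
        s = sigma(n, z)             # drift evaluated at Z_n
        lam.append(lam[-1] + s * dz[n] - 0.5 * tau * s * s)
        z += dz[n]                  # running sum: Z_{n+1} = Z_n + (F^{-1} X)_{n+1}
    return lam

if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(log_likelihood_sequence(identity, [1.0, 2.0, -1.0], lambda n, z: z, 0.1))
```

Because F is lower-triangular, each new increment depends only on the samples seen so far, so the statistic can be updated one sample at a time.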

There  are  three  basic  considerations  in  evaluating  the  usefulness  of  the  above  log- 
likelihood  ratio.  One  is  the  validity  of  the  approximation  assumption;  a  second  is  the 
development  of  procedures  for  estimating  the  parameters  of  the  likelihood  ratio;  finally, 
one  is  interested  in  whether  or  not  the  discrete-time  approximation  is  in  fact  a  likelihood 
ratio  when  our  assumptions  are  satisfied.  We  discuss  these  three  points  below. 

(i) Any Gaussian vector can be obtained by passing white Gaussian noise through an
appropriate lower-triangular matrix. Thus, the noise model is reasonable for the
discrete-time problem, and one can justify the use of multiplicity M=1 from this
and from other mathematical considerations. The fact that $(Z_t)$ is a process of
diffusion type then follows from well-known results [14]; to assume further that it is
of diffusion type with respect to $(W_t)$, one reasons that the difficult detection problems
are of most interest; such problems are those in which the N and S+N
processes have very similar properties. Since $(N_t)$ is modeled as a time-varying
linear operation on the diffusion $(W_t)$, it seems reasonable to model $(Y_t)$ as that
same time-varying linear operation on a process that is of diffusion type with
respect to $(W_t)$. More detailed physical interpretations of the assumptions can be
given for applications in sonar. However, a basic reason for making the assumptions
is that they permit one to implement an approximation to the likelihood ratio
without detailed knowledge of the data probability distributions. The validity of
these assumptions and the effectiveness of the finite-sample discrete-time approximations
can be judged in each application by the performance of the detection
algorithm.

(ii) The implementation of the sequence of test statistics $(\Lambda^k)$ given above, for $k \le n$,
requires knowledge of only two parameters: the lower-triangular matrix $F_n$ such
that $\tau F_n F_n^*$ is the $n \times n$ noise covariance matrix, and the drift function $\sigma$. Typically
(in sonar) these quantities will need to be estimated from experimental data.
We give a procedure for doing this, supposing that one has an ensemble of independent
sample vectors from the noise process, of sufficient size to give a good estimate
of the covariance matrix, and that one or more sample vectors from the signal-plus-noise
process are available.


First, the noise vector is written as $N = F\,\Delta W$, where $\Delta W$ is the vector with $j$th
component $W(j\tau) - W((j-1)\tau)$ for $j \ge 2$, and first component $W(\tau)$. If $i\tau = t_i$ and
the $ij$ element of $F$ is $F(i\tau, j\tau)$ for all $i,j$, then $N_i = N(i\tau) \approx N(t_i)$ for large $i$ and
small $\tau$. The representation for $N$ gives noise covariance matrix $R_N = \tau F F^*$.
Consistent with this representation and the results of [8], the S+N vector is written
as $X = F\,\Delta Z$, where $\Delta Z$ is the vector with $j$th component $Z(j\tau) - Z((j-1)\tau)$ for
$j \ge 2$, and first component $Z(\tau)$. Thus, given an ensemble of sample noise vectors,
one treats the resulting estimate of the noise covariance matrix as $R_N$, and obtains
the factorization $R_N = \tau F F^*$. Then, given an $X$ sample vector, $Z$ is estimated by
$\Delta Z = F^{-1} X$ and $(\Delta Z)_1 = Z(\tau)$. Given $Z$, our assumptions yield

$$Z_i = Z(i\tau) = \int_0^{i\tau} \sigma_s(Z_s)\,ds + W(i\tau).$$

Various methods can then be used to estimate the unknown function $\sigma$. A general
maximum-likelihood estimate is given in [11], which is now being computationally
investigated. The generality of this procedure leads to computational difficulties, so
that it may be necessary to assume a specific form for $\sigma$, such as a low-order polynomial
with unknown coefficients.

(iii) Under our assumptions, $\Lambda^n$ can be considered a “good” approximation to a likelihood
ratio test statistic if $\tau$ is “small” and $n$ is “large”. These are not exact statements;
at present, we have no bounds on performance. Any such bounds would
involve $R_N$, $\tau$, and $n$. However, if $X$ is Gaussian, a precise statement can be
made. Suppose that $N = F\,\Delta W$ and $X = F\,\Delta Z$ as above, and that $(a_k)$ and $(b_k)$
are two sequences of real numbers such that

$$Z_k = \sum_{j=1}^{k-1} [a_j Z_j + b_j] + W_k, \quad 2 \le k \le n,$$

$$Z_1 = W_1.$$

In this case, it can be shown that $\exp(\Lambda^n)$ is a monotone function of $dP_X/dP_N$,
thus a likelihood ratio test statistic. We conjecture that this also holds when $X$ is
not Gaussian:

$$N = F\,\Delta W, \quad X = F\,\Delta Z, \quad \text{and} \quad Z_k = \sum_{j=1}^{k-1} \sigma_j(Z_j) + W_k, \quad 1 < k \le n,$$

with $\sigma$ non-affine.
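
The estimation procedure outlined in (ii) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the general maximum-likelihood estimate of [11]: the drift is assumed to be time-invariant and linear, sigma(z) = a*z (the simplest low-order polynomial), and the coefficient a is fitted by least squares on the increments. All function names are hypothetical.

```python
import math

def sample_covariance(vectors):
    """Covariance estimate from independent zero-mean noise sample vectors."""
    m, n = len(vectors), len(vectors[0])
    return [[sum(v[i] * v[j] for v in vectors) / m for j in range(n)]
            for i in range(n)]

def cholesky_lower(mat):
    """Lower-triangular F with F F^T = mat (mat symmetric positive definite)."""
    n = len(mat)
    F = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = mat[i][j] - sum(F[i][k] * F[j][k] for k in range(j))
            F[i][j] = math.sqrt(s) if i == j else s / F[j][j]
    return F

def increments_from_data(F, x):
    """Solve F dz = x by forward substitution, giving dz = F^{-1} x."""
    dz = []
    for i, xi in enumerate(x):
        dz.append((xi - sum(F[i][j] * dz[j] for j in range(i))) / F[i][i])
    return dz

def estimate_linear_drift(dz, tau):
    """Least-squares coefficient a for the assumed drift sigma(z) = a*z."""
    z = num = den = 0.0
    for k in range(len(dz) - 1):
        z += dz[k]                 # Z at the current sample time
        num += z * dz[k + 1]       # correlate Z with the next increment
        den += tau * z * z
    return num / den

def fit_drift(noise_vectors, x, tau):
    """Noise ensemble -> covariance -> factor F -> increments -> drift fit."""
    cov = sample_covariance(noise_vectors)
    F = cholesky_lower([[c / tau for c in row] for row in cov])  # cov = tau F F^T
    return estimate_linear_drift(increments_from_data(F, x), tau)
```

For a sampled Wiener process the covariance is tau*min(i,j), and the factor F returned by `cholesky_lower` is the lower-triangular matrix of ones, that is, the summation matrix L.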

The above results give a solution to two problems of much interest in the theory of
stochastic processes: determining conditions for a discrimination problem to be non-singular,
and determining the likelihood ratio, when one of the two processes is Gaussian.
The scope of these problems can be appreciated by reviewing some of the references
cited in [5]. The above approximation to the log-likelihood ratio gives some hope
of obtaining useful new detection algorithms for some important sonar detection problems.
The eventual utility, however, will be apparent only after a great deal of further
work  is  done,  especially  computational  work  involving  experimental  data. 

DETECTION  IN  NONGAUSSIAN  NOISE 

Examination of data properties by several investigators has indicated that sonar
data may be spherically-invariant (a Gaussian mixture) in several important applications.
One such application is in under-ice operations. Analysis of such data has shown
that the univariate data typically has high kurtosis and heavy tails as compared to
Gaussian data of the same variance [10].

Another environmental situation which apparently gives rise to univariate
spherically-invariant noise is that of near-shore operations in warm climes (e.g., the Gulf
of Mexico). The nonGaussian noise in this case is attributed to snapping shrimp and
results in very high kurtosis as compared to Gaussian data [15].

These observed data properties motivated us to consider the problem of detection
in spherically-invariant noise. Such a noise process $(N_t)$ can be represented as
$N_t = A G_t$, where $(G_t)$ is a zero-mean Gaussian process with covariance function $R$, and
$A$ is a random variable independent of $(G_t)$. Such processes are also said to be
Gaussian-mixture processes.

Of course, the property of being univariate spherically-invariant does not imply
that a process will be spherically-invariant in the multivariate case, as consideration of
the case $A = 1$ will show. However, if the above representation of the noise is reasonable,
then the problem of characterizing the probability distributions for nonGaussian noise is
reduced to that of determining the covariance function of $(G_t)$ and the probability
distribution function of the random variable $A$. Without loss of generality, one can assume
that the second moment $EA^2 = 1$, so that the covariance of $(G_t)$ is the same as that of
$(N_t)$. As this function can be estimated, the major problem is that of determining the
distribution function of $A$. We are presently carrying out computational work on this
problem, using maximum-likelihood estimation.

The  significance  of  this  model,  if  accurate,  is  that  it  would  permit  one  to  describe 
all  the  joint  distributions  of  the  data  through  knowledge  of  the  covariance  (as  in  the 
Gaussian  case)  and  of  the  distribution  function  for  a  single  random  variable.  It  can  thus 
be  viewed  as  a  first  step  away  from  the  Gaussian  noise  hypothesis  which  does  not  require 
that  one  take  independent  samples. 

We are interested here in obtaining the same results (1)-(5) discussed above for the
Gaussian noise case. So far, partial results have been obtained for (1) and (2). We have
found [9] that the sufficient conditions for a well-defined (non-singular) detection problem
are the same as those obtained for detection in Gaussian noise (which are very close
to being necessary). An expression for the continuous-time likelihood ratio has also been
found. The remaining problems in obtaining (3)-(5) above have yet to be seriously
investigated.


MUTUAL  INFORMATION  AND  CHANNEL  CAPACITY 

Work to be discussed here was again for two types of noise processes: one where the
channel noise is Gaussian, the other where it is spherically-invariant.

The capacity of a communication channel is here taken to be its information capacity:

$$C = \sup_{Q} I(m, Y)$$

where $m$ is the message process, $A(m) = s$ is the transmitted signal, $N$ is the channel
noise, $Y = A(m) + N$ is the received process, and $I(u,v)$ is the mutual information
between stochastic processes $u$ and $v$ (as defined in [4]). The constraint class $Q$ contains
all admissible message processes $m$ and coding functions $A$. It is usually chosen
from considerations involving average power, so typically involves a relation between the
signal process and the noise covariance. In the case of stationary signal and noise
processes, with spectral density functions $\Phi_S$ and $\Phi_N$, an appropriate constraint is

$$\int_{-\infty}^{\infty} \frac{\Phi_S(\lambda)}{\Phi_N(\lambda)}\,d\lambda \le P.$$

This can be related to the reproducing kernel Hilbert space of the noise process, and a
related general constraint is $E\|A(m)\|_N^2 \le P$, where $\|v\|_N$ is the reproducing kernel Hilbert
space (for $N$) norm of the function $v(t)$.

If one considers this in physical terms for the frequency domain, such a constraint
places a limitation on the expected value of the integrated ratio of signal energy to noise
energy. In non-white noise, this is obviously more realistic than a limitation on total signal
energy alone.
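
The frequency-domain reading of the constraint can be illustrated numerically. The sketch below is a minimal example, assuming hypothetical spectral densities Phi_N(l) = 1/(1+l^2) and Phi_S(l) = 1/(1+l^2)^2 that are chosen only because the integrals are known in closed form. It approximates the integrated signal-to-noise ratio by the trapezoidal rule on a truncated frequency interval.

```python
import math

def integrated_ratio(phi_s, phi_n, half_width=200.0, steps=200_000):
    """Trapezoidal approximation of the integral of phi_s/phi_n
    over the truncated interval [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        lam = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # endpoint weights
        total += weight * phi_s(lam) / phi_n(lam)
    return total * h

def phi_noise(lam):
    """Hypothetical non-white noise spectral density."""
    return 1.0 / (1.0 + lam * lam)

def phi_signal(lam):
    """Hypothetical signal spectral density, smoother than the noise."""
    return 1.0 / (1.0 + lam * lam) ** 2

if __name__ == "__main__":
    # The exact integral over the whole line is pi; truncation loses about 0.01.
    print(integrated_ratio(phi_signal, phi_noise))
```

For these densities the integrated ratio is pi, while the total signal energy (the integral of Phi_S alone) is pi/2, so the two kinds of constraint weight the same signal differently when the noise is not white.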


MISMATCHED  CHANNELS 

With the type of constraint discussed above, a complete solution to the channel
capacity problem for Gaussian channels without feedback is given in [4]. However, this
approach will not be valid when the covariance of the channel noise $(N_t)$ is unknown.
This can occur from natural causes, as with insufficient knowledge of the environment.
It can also occur because of jamming in the channel. In the latter case, it is well-known
that if the channel noise has a given covariance, then channel capacity is minimized
when the noise is Gaussian. Thus, a jammer seeking to minimize capacity of a channel
with ambient Gaussian noise would choose to add Gaussian noise, and the channel capacity
would then be determined by the relation between the actual channel noise (including
the jammer’s contribution) and the noise covariance assumed by the user of the
channel. Of course, less obvious questions also arise. Channel capacity is only the starting
point in analyzing such situations.

These considerations have motivated us to introduce the notion of “mismatched”
channels, wherein the constraint on transmitted signals is taken with respect to a covariance
which is different from that of the channel noise.

An analysis of this problem is contained in [6] for a large class of Gaussian noise
processes. Additional results are forthcoming [7]. Striking differences appear between
the results for the mismatched channel and those for the matched channel (when the
channel noise is also the constraint noise). For example, in the matched continuous-time
channel with the above generalized power constraint ($E\|A(m)\|_N^2 \le P$), the capacity of
the channel is equal to $P/2$ and cannot be actually attained. In the mismatched channel,
the capacity can be either greater or smaller than the capacity for the matched
channel and it can be attained in some situations. The value of the capacity depends on
the relation between the two covariances. We give one result from [6].


Let the constraint covariance operator in $L_2[0,T]$ be denoted by $R_W$ (consider $W$ as
the noise assumed by the channel user) and let $R_N$ be the covariance operator for the
channel noise process $N$. Suppose that $R_N = R_W^{1/2}(I+S)R_W^{1/2}$, where $I$ is the identity in
$L_2[0,T]$ and $S$ is a compact operator. This relation will be satisfied, for example, when
$W$ and $N$ are two Gaussian processes for which the discrimination (detection) problem is
well-defined. Let

$$C_W(P) = \sup_{Q} I[A(m), Y]$$

when $Q$ contains all coding operations $A$ and stochastic processes $m$ (including
nonGaussian processes) on $[0,T]$ such that $E\|A(m)\|_W^2 \le P$. Finally, let $\{\lambda_n, n \ge 1\}$
denote the strictly negative eigenvalues of the operator $S$ defined above. Of course, this
set may be empty, as when $N$ can be written as $N = W + V$, with $V$ independent of $W$.
Let $\{e_n, n \ge 1\}$ be associated o.n. eigenvectors. Then [6]:

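(a) If $\{\lambda_n, n \ge 1\}$ is not empty and $\sum_n |\lambda_n| \le P$, then
$C_W(P) = \frac{1}{2}\sum_n \log[(1+\lambda_n)^{-1}] + \frac{1}{2}\left(P + \sum_n \lambda_n\right)$.

(b) If $\{\lambda_n, n \ge 1\}$ is not empty, and $\sum_n |\lambda_n| > P$, then there exists a largest integer $K$
such that $\sum_{i=1}^{K} \lambda_i + P \ge K\lambda_K$, and

$$C_W(P) = \frac{1}{2}\sum_{n=1}^{K} \log\left[\frac{\sum_{i=1}^{K}\lambda_i + P + K}{K(1+\lambda_n)}\right].$$

(c) If $\{\lambda_n, n \ge 1\}$ is empty, $C_W(P) = P/2$.

(d) In (a) and (b), the capacity is strictly greater than when $R_N = R_W$; in (c) these
capacities are equal.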

(e) In (a), the capacity can be attained if and only if $\sum_n |\lambda_n| = P$. It is then attained
by a Gaussian signal with covariance operator $R = \sum_{n \ge 1} \beta_n\, u_n \otimes u_n$, where
$u_n = U e_n$, $U$ unitary, and
$\beta_n = -\lambda_n (1+\lambda_n)^{-1}$ for $n \ge 1$. In (b), the capacity can be attained by a Gaussian
signal process with covariance operator as above, with $u_n = U e_n$ and

$$\beta_n = \frac{\sum_{i=1}^{K}\lambda_i + P + K}{K(1+\lambda_n)} - 1 \quad \text{for } n \le K;$$

$\beta_n = 0$ for $n > K$. In (c), the capacity cannot be attained.
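
The three capacity cases can be evaluated with a short routine. The sketch below assumes the water-filling form of cases (a)-(c): C_W(P) = (1/2) sum log[1/(1+lambda_n)] + (1/2)(P + sum lambda_n) when sum |lambda_n| <= P; C_W(P) = (1/2) sum_{n<=K} log[(sum_{i<=K} lambda_i + P + K)/(K(1+lambda_n))] otherwise; and P/2 when there are no negative eigenvalues. It further assumes a finite list of eigenvalues, each greater than -1, ordered from most negative upward. The function name is hypothetical.

```python
import math

def mismatched_capacity(neg_eigs, P):
    """Capacity C_W(P) from the strictly negative eigenvalues of S (finite list)."""
    lams = sorted(l for l in neg_eigs if l < 0.0)  # most negative first
    if any(l <= -1.0 for l in lams):
        raise ValueError("eigenvalues of S must exceed -1")
    if not lams:
        return P / 2.0                              # case (c)
    total = sum(lams)
    if -total <= P:                                 # case (a)
        return 0.5 * sum(-math.log1p(l) for l in lams) + 0.5 * (P + total)
    # case (b): largest K with sum_{i<=K} lam_i + P >= K * lam_K
    K, partial, sum_K = 0, 0.0, 0.0
    for k, l in enumerate(lams, start=1):
        partial += l
        if partial + P >= k * l:
            K, sum_K = k, partial
    level = (sum_K + P + K) / K
    return 0.5 * sum(math.log(level / (1.0 + l)) for l in lams[:K])

if __name__ == "__main__":
    print(mismatched_capacity([], 2.0))            # matched value P/2
    print(mismatched_capacity([-0.5], 1.0))        # case (a)
    print(mismatched_capacity([-0.5, -0.4], 0.2))  # case (b)
```

In each example with a negative eigenvalue the returned value exceeds P/2, in line with statement (d).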

A  more  general  model  is  considered  in  [7] .  That  model  is  for  the  case  where  S  is 
not  necessarily  compact,  but  has  a  pure  point  spectrum.  The  capacity  for  the 
mismatched  Gaussian  channel  can  then  be  either  smaller  or  larger  than  that  of  the 
matched  channel,  depending  on  the  spectral  properties  of  the  operator  S . 

CAPACITY  OF  SPHERICALLY-INVARIANT  CHANNELS 

The apparent usefulness of a spherically-invariant process to model noise in under-ice
and shallow-water applications has motivated us to examine the channel capacity
problem for communicating in such noise. This work has been aided by the work on signal
detection described above; in fact, the likelihood ratio plays a key role in channel
capacity problems.

We have examined the problem for the matched channel, where the constraint on
transmitted signal is $E\|A(m)\|_N^2 \le P$, with $N$ the channel noise. As shown in [4] and
[13], the capacity for the matched Gaussian channel with this constraint is $P/2$, with or
without feedback. For the spherically-invariant channel with noise model $N_t = A G_t$, $A$
a random variable independent of the Gaussian process $(G_t)$, $EA^2 = 1$, we have found the
capacity to be equal to $\frac{P}{2} E(A^{-2})$. $E(A^{-2})$ will typically be quite large for some underwater
acoustics applications. Thus, this result holds forth the possibility that one may be
able to communicate at much higher rates than for the Gaussian channel with the same
covariance.
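
The size of the gain can be illustrated with a small calculation. The sketch below assumes the capacity expression (P/2) E(A^{-2}) with the normalization E(A^2) = 1, and a hypothetical two-point distribution for A that is not taken from measured data. The function name is likewise hypothetical.

```python
def si_capacity(scales, probs, P):
    """(P/2) * E[A^-2] for a discrete A, normalized so that E[A^2] = 1."""
    m2 = sum(p * a * a for a, p in zip(scales, probs))
    if abs(m2 - 1.0) > 1e-9:
        raise ValueError("A must satisfy E[A^2] = 1")
    inv2 = sum(p / (a * a) for a, p in zip(scales, probs))  # E[A^-2]
    return 0.5 * P * inv2

if __name__ == "__main__":
    P = 2.0
    print(si_capacity([1.0], [1.0], P))            # Gaussian case, equals P/2
    print(si_capacity([0.5, 2.0], [0.8, 0.2], P))  # mixture case, about 3.25
```

Since 1/x is convex, Jensen's inequality gives E(A^{-2}) >= 1/E(A^2) = 1, so under this expression the capacity is never below the Gaussian value P/2.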

APPENDIX


Absolute  Continuity  and  Likelihood  Ratio 

Definitions  and  Notation 

All stochastic processes are defined on the probability space $(\Omega, \beta, P)$, with parameter
set $[0,1]$. $\mathcal{R}^K$ is the Borel $\sigma$-field for $R^K$, $K < \infty$. $C_0[0,1] \equiv C_0$ is the set of all
real-valued continuous functions on $[0,1]$ that vanish at zero. $\mathcal{C}$ is the Borel $\sigma$-field on $C_0$
defined by the sup norm. $\mathcal{C}^K$ is the Borel $\sigma$-field of $C_0^K$ under the product topology;
$C_0^K$ can be identified with the set of all $K$-component real-valued vector functions having
each component in $C_0$.

Suppose that $(V_t)$ is a vector stochastic process such that $V(\omega,\cdot) \in C_0^K$ a.e. $dP(\omega)$.
$V$ will denote the corresponding path map from $\Omega$ into $C_0^K$, and $P_V$ the induced measure
on $\mathcal{C}^K$: $P_V = P \circ V^{-1}$.

$(N_t)$ will denote the noise; it is m.s.-continuous, Gaussian, zero-mean, and vanishes
at t=0 w.p. 1. $(N_t)$ is thus purely nondeterministic, so has a proper canonical Cramér-Hida
representation [12]:

$$N_t = \sum_{i=1}^{M} \int_0^t F_i(t,s)\, dB_i(s) \qquad (1)$$

where $M \le \infty$ is the multiplicity of $(N_t)$, each $F_i$ is a deterministic Volterra kernel, and
the $B_i$'s are mutually orthogonal stochastic processes with orthogonal increments. $(N_t)$
is Gaussian; the $B_i$'s are thus mutually independent Gaussian processes with independent
increments and continuous variances. Each $B_i$ is thereby path-continuous. Since
the representation (1) is proper canonical, and $(N_t)$ and the family of $B_i$'s are Gaussian,
the $\sigma$-field generated by $\{N_u, u \le s\}$ is the same as the $\sigma$-field generated by
$\{B_i(u), u \le s, i \le M\}$, for all $s$ in $[0,1]$. $\beta_i$ will denote the Borel measure on $[0,1]$
defined by the continuous non-decreasing variances $EB_i^2(s)$, $0 \le s \le 1$.

We assume that the multiplicity $M$ of $(N_t)$ is finite. This restriction is due to the
absence of some needed results in infinite-dimensional stochastic calculus. However,
based on a partial investigation, we believe that the results on absolute continuity and
likelihood ratio presented here remain valid for $M = \infty$.

Suppose that $(V_t)$ is any stochastic process; $\sigma(V)$ is the $P$-completed filtration generated
by $(V_t)$, and $\sigma(V) \vee \sigma(N)$ is the smallest filtration containing both $\sigma(V)$ and $\sigma(N)$.
We recall that a process $(X_t)$ is $\sigma(V)$-predictable if $G: (t,\omega) \to X_t(\omega)$ is measurable with
respect to the predictable $\sigma$-field $\mathcal{P}(V)$ in $R_+ \times \Omega$; $\mathcal{P}(V)$ is generated by all
path-continuous stochastic processes that are adapted to $\sigma(V)$.

$\rho_N$ will denote the covariance function of $(N_t)$, $H_N$ its RKHS (reproducing kernel
Hilbert space) with inner product $\langle\cdot,\cdot\rangle_N$, and $R_N$ the covariance operator of $(N_t)$ in
$L_2[0,1]$. Range$(R_N^{1/2})$ is a separable Hilbert space, isomorphic to $H_N$, under the inner
product $(u,g)_N = \sum_n \langle u, e_n\rangle \langle g, e_n\rangle / \lambda_n$, where $\langle\cdot,\cdot\rangle$ is the $L_2[0,1]$ inner product,
$\{\lambda_n, n \ge 1\}$ are the non-zero eigenvalues of $R_N$, and $\{e_n, n \ge 1\}$ are associated o.n.
eigenvectors.
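
The eigen-expansion form of the inner product can be illustrated on a case where everything is known in closed form. The sketch below is an illustration only: it assumes the Wiener covariance min(s,t) on [0,1], whose eigenvalues are lambda_n = 1/((n-1/2)^2 pi^2) with eigenvectors e_n(t) = sqrt(2) sin((n-1/2) pi t), and evaluates the squared norm (u,u)_N for the ramp u(t) = t. For this covariance the squared RKHS norm equals the integral of u'(t)^2, which is 1 for the ramp. Function names are hypothetical.

```python
import math

def rkhs_norm_sq(coeffs, eigvals):
    """Truncated sum of <u, e_n>^2 / lambda_n."""
    return sum(c * c / lam for c, lam in zip(coeffs, eigvals))

def wiener_eigen(n_terms):
    """Frequencies w_n = (n - 1/2) pi and eigenvalues 1/w_n^2 of min(s, t) on [0, 1]."""
    omegas = [(n - 0.5) * math.pi for n in range(1, n_terms + 1)]
    return omegas, [1.0 / (w * w) for w in omegas]

def ramp_coeffs(omegas):
    """<u, e_n> for u(t) = t and e_n(t) = sqrt(2) sin(w_n t), integrated in closed form."""
    return [math.sqrt(2.0) * (math.sin(w) / (w * w) - math.cos(w) / w)
            for w in omegas]

if __name__ == "__main__":
    omegas, eigvals = wiener_eigen(2000)
    # Partial sums increase toward 1, the integral of u'(t)^2 over [0, 1].
    print(rkhs_norm_sq(ramp_coeffs(omegas), eigvals))
```

A function belongs to the RKHS exactly when this sum converges, which is the sense in which signal paths must be smoother than noise paths.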

$R^{[0,1]}$ is the space of real-valued functions on $[0,1]$; $\mathcal{R}^{[0,1]}$ is the $\sigma$-field generated
by the cylinder sets $\{f \in R^{[0,1]}: (f(t_1),\ldots,f(t_n)) \in A^n\}$, $n < \infty$, $A^n$ a Borel set in $R^n$.

For a scalar stochastic process $(V_t)$, $\nu_V$ is the probability induced on $\mathcal{R}^{[0,1]}$ by $(V_t)$.
If $(V_t)$ has paths belonging a.s. to $L_2[0,1]$, then $\mu_V$ will denote the probability induced by
the path map on the Borel $\sigma$-field of $L_2[0,1]$. If $\nu_1$ and $\nu_2$ are two probabilities on the
same $\sigma$-field, then $\nu_1 \ll \nu_2$ means that $\nu_1$ is absolutely continuous with respect to $\nu_2$.

ABSOLUTE  CONTINUITY 

Theorem 1 [8]: Let $(V_t)$ be a stochastic process independent of $(N_t)$. Suppose that $(Y_t)$ is
a process such that $\nu_Y \ll \nu_N$.

If $(Y_t)$ is adapted to $\sigma(N) \vee \sigma(V)$, then $Y_t = S_t + N_t^*$ a.e. $dP$ for each fixed $t$ in
$[0,1]$, where $(N_t^*)$ has the same finite dimensional distributions as $(N_t)$, and is adapted to
$\sigma(Y)$. $N_t^* = \sum_{i=1}^{M} \int_0^t F_i(t,s)\, dB_i^*(s)$ a.e. $dP$, each fixed $t$ in $[0,1]$, where the $B_i^*$'s are
mutually independent zero-mean Gaussian processes, $(B_t^*)$ has the same law as $(B_t)$, and
$\sigma(B^*) = \sigma(N^*)$. Moreover,

$$S_t = \sum_{i=1}^{M} \int_0^t F_i(t,s)\, \phi_i(s)\, d\beta_i(s) \qquad (2)$$

where $(\phi_i(t))$, $i \le M$, is a stochastic process that is $\sigma(Y)$-predictable and has paths a.s.
in $L_2[\beta_i]$.

If both $(N_t)$ and $(Y_t)$ have continuous paths, then Theorem 1 can be strengthened.
In that case, let $P_N$ and $P_Y$ be the induced measures on $\mathcal{C}$. Then

$$\nu_Y \ll \nu_N \iff P_Y \ll P_N \iff \mu_Y \ll \mu_N.$$

Theorem 2 [8]: Let $(V_t)$ be a stochastic process independent of $(N_t)$. Suppose that $(S_t)$ is
a stochastic process adapted to $\sigma(N) \vee \sigma(V)$ and with paths a.s. in $H_N$.

(1) If $X_t = S_t + N_t$ a.e. $dP$, for each fixed $t$ in $[0,1]$, then $\nu_X \ll \nu_N$.

(2) If $X_t = S_t + N_t$ a.e. $dt\,dP$, then $\mu_X \ll \mu_N$.


Likelihood Ratio

Suppose that $(Y_t)$ satisfies the measurability assumption in Theorem 1, and that
$\nu_Y \ll \nu_N$. Define a vector process $(Z_t)$ with paths a.s. in $C_0^M$ by

$$Z_i(t) = \int_0^t \phi_i(s)\, d\beta_i(s) + B_i(t), \quad i \le M, \qquad (3)$$

where $\phi_i$ is defined in Theorem 1. In this case, $P_Z \ll P_B$ [14].

Theorem 3 [8]: Suppose that $(Y_t)$ satisfies the sufficient conditions of Theorem 2. Then

$$\frac{d\nu_Y}{d\nu_N}(x) = \int_{C_0^M} \left[\frac{dP_Z}{dP_B}\right](y)\, dP_{B|N=x}(y)$$

a.e. $d\nu_N(x)$, where $P_{B|N=x}$ is the conditional measure of $B$ given $N = x$. If $(S_t)$ is
defined as in Theorem 1, and $Y = S + N$, then

$$\frac{d\mu_Y}{d\mu_N}(x) = \int_{C_0^M} \left[\frac{dP_Z}{dP_B}\right](y)\, P(x, dy)$$

a.e. $d\mu_N(x)$, where $P$ is a transition probability on $L_2[0,1] \times \mathcal{C}^M$, and $P(x,\cdot) \perp P_B$ a.e.
$d\mu_N(x)$. Moreover, $P(x,\cdot)$ is a point mass on $C_0^M$, giving probability one to $\{m^*(x)\}$,
where

$$[m^*(x)]_i(t) = \sum_n \langle x, e_n\rangle \langle f_i^t, e_n\rangle / \lambda_n, \quad \text{with} \quad f_i^t(u) = \int_0^t F_i(u,s)\, d\beta_i(s).$$

From Theorem 3, one can obtain $d\nu_Y/d\nu_N$ and $d\mu_Y/d\mu_N$ from $dP_Z/dP_B$. Since
$(B_t)$ is a vector process with components that are mutually independent continuous-path
Gaussian martingales w.r.t. $\sigma(N)$, $dP_Z/dP_B$ can be obtained from well-known results [14].


